Remove all nemo2 imports from old repo by oyilmaz-nvidia · Pull Request #628 · NVIDIA-NeMo/Export-Deploy

oyilmaz-nvidia · 2026-03-03T23:45:12Z

No description provided.

… dynamic inference

- Add nemo_deploy/llm/inference/nemo_utils.py, which vendors standalone NeMo utilities (MCoreTokenizerWrappper, ckpt path helpers, constants) with no dependency on the nemo package, and re-exports the complex NeMo types (GPTConfig, T5Config, io, set_modelopt_spec_if_exists_in_ckpt) under a single HAVE_NEMO guard.
- Remove direct "from nemo.*" imports from inference_base.py and tron_utils.py; both files now import from the local nemo_utils module instead.
- Fix AttributeError in create_mcore_engine: GPTInferenceWrapper was called with (model, inference_context), but the deployed Megatron-LM API expects (model, inference_wrapper_config, inference_context). Add InferenceWrapperConfig built from model.config attributes; MCoreEngine then internally creates a DynamicInferenceContext and switches to DynamicInferenceEngine.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
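The "single HAVE_NEMO guard" mentioned in this commit message is a common optional-dependency pattern. The following is only a minimal sketch of that pattern, not the code from nemo_utils.py: the helper names optional_import and require_nemo are hypothetical, and the module path nemo.collections.llm is an assumption about where GPTConfig and T5Config would be found.

```python
import importlib


def optional_import(module_name, attr_names):
    """Try to import module_name; return (available, {name: attribute or None})."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:  # also covers ModuleNotFoundError
        return False, {name: None for name in attr_names}
    return True, {name: getattr(module, name, None) for name in attr_names}


# Assumption: the module path below is illustrative; the PR does not state it.
HAVE_NEMO, _nemo = optional_import("nemo.collections.llm", ("GPTConfig", "T5Config"))
GPTConfig = _nemo["GPTConfig"]  # None when nemo is not installed
T5Config = _nemo["T5Config"]


def require_nemo(feature):
    """Hypothetical helper: fail with a clear message only when nemo is really needed."""
    if not HAVE_NEMO:
        raise ImportError(f"{feature} needs the optional 'nemo' package, which is not installed.")
```

With a guard like this, importing the module never fails on a machine without nemo; only code paths that call require_nemo (or use the re-exported names) surface the missing dependency.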

- Fix import ordering in test_inference_base.py (ruff I001)
- Remove direct nemo imports from inference_base.py, nemo_utils.py, tron_utils.py
- Add nemo_io.py with standalone load_context implementation
- Remove HAVE_NEMO guard checks now that nemo is no longer a static dependency
- Update tests to remove HAVE_NEMO patches and use types.SimpleNamespace
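As an illustration of the last item, types.SimpleNamespace can stand in for a model object in tests so that no nemo type has to be imported or patched. This is a sketch under assumptions: describe_model is a hypothetical helper, not a function from this PR, and the field values are made up for the example.

```python
from types import SimpleNamespace


def describe_model(model):
    """Hypothetical helper: read two config fields from a model-like object."""
    cfg = model.config
    return {"hidden_size": cfg.hidden_size, "params_dtype": cfg.params_dtype}


# A lightweight test double: plain attribute access, no nemo classes involved.
fake_model = SimpleNamespace(
    config=SimpleNamespace(hidden_size=4096, params_dtype="bfloat16")
)

summary = describe_model(fake_model)
```

Because the code under test only relies on attribute access, the test double needs no mocking framework and no HAVE_NEMO patching.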

- Remove unused StaticInferenceContext import
- Use inner model config for hidden_size/params_dtype instead of outer model
- Add buffer_size_gb param to create_mcore_engine and MegatronLLMDeployable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
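The second and third items can be pictured roughly as follows. This is a sketch only: WrapperSettings is a hypothetical stand-in and not Megatron-LM's InferenceWrapperConfig, the assumption that wrappers expose the wrapped model as a .module attribute is mine, and the default of 10.0 for buffer_size_gb is a placeholder rather than the value used in the PR.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class WrapperSettings:
    """Hypothetical stand-in for the settings passed to the inference wrapper."""
    hidden_size: int
    params_dtype: str
    buffer_size_gb: float


def unwrap_model(model):
    """Follow .module attributes (assumed wrapper convention) to the inner model."""
    while hasattr(model, "module"):
        model = model.module
    return model


def build_wrapper_settings(model, buffer_size_gb=10.0):
    """Read hidden_size/params_dtype from the inner model's config, not the outer wrapper."""
    inner_config = unwrap_model(model).config
    return WrapperSettings(
        hidden_size=inner_config.hidden_size,
        params_dtype=inner_config.params_dtype,
        buffer_size_gb=buffer_size_gb,  # placeholder default, see lead-in
    )


# Usage with a doubly wrapped fake model.
inner = SimpleNamespace(config=SimpleNamespace(hidden_size=1024, params_dtype="bf16"))
outer = SimpleNamespace(module=SimpleNamespace(module=inner))
settings = build_wrapper_settings(outer, buffer_size_gb=4.0)
```

Reading from the inner model avoids an AttributeError when the outer wrapper object carries no config of its own, which is the situation the commit message describes.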

…linting

copy-pr-bot · 2026-03-03T23:45:16Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

oyilmaz-nvidia and others added 4 commits March 3, 2026 16:41

Merge branch 'remove-direct-nemo-imports-in-inference' into fix/ruff-…

a37a149

…linting

oyilmaz-nvidia requested review from athitten, meatybobby and pthombre as code owners March 3, 2026 23:45

github-actions bot added deploy LLM export tests labels Mar 3, 2026

oyilmaz-nvidia wants to merge 4 commits into main from fix/ruff-linting
